2,328 research outputs found

    A combined test for a generalized treatment effect in clinical trials with a time-to-event outcome

    Get PDF
    Most randomized controlled trials with a time-to-event outcome are designed and analyzed assuming proportional hazards of the treatment effect. The sample-size calculation is based on a log-rank test or the equivalent Cox test. Nonproportional hazards are seen increasingly in trials and are recognized as a potential threat to the power of the log-rank test. To address the issue, Royston and Parmar (2016, BMC Medical Research Methodology 16: 16) devised a new "combined test" of the global null hypothesis of identical survival curves in each trial arm. The test, which combines the conventional Cox test with a new formulation, is based on the maximal standardized difference in restricted mean survival time (RMST) between the arms. The test statistic is based on evaluations of RMST over several preselected time points. The combined test involves the minimum p-value across the Cox and RMST-based tests, appropriately standardized to have the correct null distribution. In this article, I outline the combined test and introduce a command, stctest, that implements the combined test. I point the way to additional tools currently under development for power and sample-size calculation for the combined test

    Power and sample-size analysis for the Royston–Parmar combined test in clinical trials with a time-to-event outcome

    Get PDF
    Randomized controlled trials with a time-to-event outcome are usually designed and analyzed assuming proportional hazards (PH) of the treatment effect. The sample-size calculation is based on a log-rank test or the nearly identical Cox test, henceforth called the Cox/log-rank test. Nonproportional hazards (non-PH) has become more common in trials and is recognized as a potential threat to interpreting the trial treatment effect and the power of the log-rank test—hence to the success of the trial. To address the issue, in 2016, Royston and Parmar (BMC Medical Research Methodology 16: 16) proposed a "combined test" of the global null hypothesis of identical survival curves in each trial arm. The Cox/logrank test is combined with a new test derived from the maximal standardized difference in restricted mean survival time (RMST) between the trial arms. The test statistic is based on evaluations of the between-arm difference in RMST over several preselected time points. The combined test involves the minimum p-value across the Cox/log-rank and RMST-based tests, appropriately standardized to have the correct distribution under the global null hypothesis. In this article, I introduce a new command, power_ct, that uses simulation to implement power and sample-size calculations for the combined test. power_ct supports designs with PH or non-PH of the treatment effect. I provide examples in which the power of the combined test is compared with that of the Cox/log-rank test under PH and non-PH scenarios. I conclude by offering guidance for sample-size calculations in time-to-event trials to allow for possible non-PH

    Augmenting the logrank test in the design of clinical trials in which non-proportional hazards of the treatment effect may be anticipated

    Get PDF
    © 2016 Royston and Parmar. Background: Most randomized controlled trials with a time-to-event outcome are designed assuming proportional hazards (PH) of the treatment effect. The sample size calculation is based on a logrank test. However, non-proportional hazards are increasingly common. At analysis, the estimated hazards ratio with a confidence interval is usually presented. The estimate is often obtained from a Cox PH model with treatment as a covariate. If non-proportional hazards are present, the logrank and equivalent Cox tests may lose power. To safeguard power, we previously suggested a 'joint test' combining the Cox test with a test of non-proportional hazards. Unfortunately, a larger sample size is needed to preserve power under PH. Here, we describe a novel test that unites the Cox test with a permutation test based on restricted mean survival time. Methods: We propose a combined hypothesis test based on a permutation test of the difference in restricted mean survival time across time. The test involves the minimum of the Cox and permutation test P-values. We approximate its null distribution and correct it for correlation between the two P-values. Using extensive simulations, we assess the type 1 error and power of the combined test under several scenarios and compare with other tests. We investigate powering a trial using the combined test. Results: The type 1 error of the combined test is close to nominal. Power under proportional hazards is slightly lower than for the Cox test. Enhanced power is available when the treatment difference shows an 'early effect', an initial separation of survival curves which diminishes over time. The power is reduced under a 'late effect', when little or no difference in survival curves is seen for an initial period and then a late separation occurs. We propose a method of powering a trial using the combined test. The 'insurance premium' offered by the combined test to safeguard power under non-PH represents about a single-digit percentage increase in sample size. Conclusions: The combined test increases trial power under an early treatment effect and protects power under other scenarios. Use of restricted mean survival time facilitates testing and displaying a generalized treatment effect

    Investigating treatment-effect modification by a continuous covariate in IPD meta-analysis: an approach using fractional polynomials

    Get PDF
    Background: In clinical trials, there is considerable interest in investigating whether a treatment effect is similar in all patients, or that one or more prognostic variables indicate a differential response to treatment. To examine this, a continuous predictor is usually categorised into groups according to one or more cutpoints. Several weaknesses of categorization are well known. To avoid the disadvantages of cutpoints and to retain full information, it is preferable to keep continuous variables continuous in the analysis. To handle this issue, the Subpopulation Treatment Effect Pattern Plot (STEPP) was proposed about two decades ago, followed by the multivariable fractional polynomial interaction (MFPI) approach. Provided individual patient data (IPD) from several studies are available, it is possible to investigate for treatment heterogeneity with meta-analysis techniques. Meta-STEPP was recently proposed and in patients with primary breast cancer an interaction of estrogen receptors with chemotherapy was investigated in eight randomized controlled trials (RCTs). Methods: We use data from eight randomized controlled trials in breast cancer to illustrate issues from two main tasks. The first task is to derive a treatment effect function (TEF), that is, a measure of the treatment effect on the continuous scale of the covariate in the individual studies. The second is to conduct a meta-analysis of the continuous TEFs from the eight studies by applying pointwise averaging to obtain a mean function. We denote the method metaTEF. To improve reporting of available data and all steps of the analysis we introduce a three-part profile called MethProf-MA. Results: Although there are considerable differences between the studies (populations with large differences in prognosis, sample size, effective sample size, length of follow up, proportion of patients with very low estrogen receptor values) our results provide clear evidence of an interaction, irrespective of the choice of the FP function and random or fixed effect models. Conclusions: In contrast to cutpoint-based analyses, metaTEF retains the full information from continuous covariates and avoids several critical issues when performing IPD meta-analyses of continuous effect modifiers in randomised trials. Early experience suggests it is a promising approach. Trial registration: Not applicable

    An approach to trial design and analysis in the era of non-proportional hazards of the treatment effect

    Get PDF
    Background: Most randomized controlled trials with a time-to-event outcome are designed and analysed under the proportional hazards assumption, with a target hazard ratio for the treatment effect in mind. However, the hazards may be non-proportional. We address how to design a trial under such conditions, and how to analyse the results. Methods: We propose to extend the usual approach, a logrank test, to also include the Grambsch-Therneau test of proportional hazards. We test the resulting composite null hypothesis using a joint test for the hazard ratio and for time-dependent behaviour of the hazard ratio. We compute the power and sample size for the logrank test under proportional hazards, and from that we compute the power of the joint test. For the estimation of relevant quantities from the trial data, various models could be used; we advocate adopting a pre-specified flexible parametric survival model that supports time-dependent behaviour of the hazard ratio. Results: We present the mathematics for calculating the power and sample size for the joint test. We illustrate the methodology in real data from two randomized trials, one in ovarian cancer and the other in treating cellulitis. We show selected estimates and their uncertainty derived from the advocated flexible parametric model. We demonstrate in a small simulation study that when a treatment effect either increases or decreases over time, the joint test can outperform the logrank test in the presence of both patterns of non-proportional hazards. Conclusions: Those designing and analysing trials in the era of non-proportional hazards need to acknowledge that a more complex type of treatment effect is becoming more common. Our method for the design of the trial retains the tools familiar in the standard methodology based on the logrank test, and extends it to incorporate a joint test of the null hypothesis with power against non-proportional hazards. For the analysis of trial data, we propose the use of a pre-specified flexible parametric model that can represent a time-dependent hazard ratio if one is present

    Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data

    Get PDF
    BACKGROUND: Prognostic studies of time-to-event data, where researchers aim to develop or validate multivariable prognostic models in order to predict survival, are commonly seen in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events per variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability. METHODS: We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas. RESULTS: We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell's c-index to D. A flow chart is provided to aid decision making when using these methods. CONCLUSION: We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research

    Life expectancy difference and life expectancy ratio: two measures of treatment effects in randomised trials with non-proportional hazards

    Get PDF
    The hazard ratio (HR) is the most common measure of treatment effect in clinical trials that use time-to-event outcomes such as survival. When survival curves cross over or separate only after a considerable time, the proportional hazards assumption of the Cox model is violated, and HR can be misleading. We present two measures of treatment effects for situations where the HR changes over time: the life expectancy difference (LED) and life expectancy ratio (LER). LED is the difference between mean survival times in the intervention and control arms. LER is the ratio of these two times. LED and LER can be calculated for at least two time intervals during the trial, allowing for curves where the treatment effect changes over time. The two measures are readily interpretable as absolute and relative gains or losses in life expectancy
    • …
    corecore